Zirui Song

Token Predictors Are Not Planners: Building Physically Grounded Causal Reasoners

Jun 01, 2026

TextAlign: Preference Alignment for Text Rendering with Hierarchical Rewards

May 19, 2026

The Cylindrical Representation Hypothesis for Language Model Steering

May 03, 2026

ServImage: An Image Generation and Editing Benchmark from Real-world Commercial Imaging Services

Apr 27, 2026

FlashSign: Pose-Free Guidance for Efficient Sign Language Video Generation

Mar 30, 2026

When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection

Oct 14, 2025

SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models

May 29, 2025

Divide-Fuse-Conquer: Eliciting "Aha Moments" in Multi-Scenario Games

May 22, 2025

ManipLVM-R1: Reinforcement Learning for Reasoning in Embodied Manipulation with Large Vision-Language Models

May 22, 2025

Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models

May 21, 2025